
feat: Rust recipe runner integration with engine selection #2951

Merged
rysweet merged 7 commits into main from feature/rust-recipe-runner-integration
Mar 8, 2026

Conversation

Owner

@rysweet rysweet commented Mar 8, 2026

Rust Recipe Runner Integration

Integrates the standalone Rust recipe runner into amplihack with automatic engine selection and startup dependency management.

What's included

  • src/amplihack/recipes/rust_runner.py — Binary wrapper with find_rust_binary(), ensure_rust_recipe_runner(), and run_recipe_via_rust(). Raises RustRunnerNotFoundError when the Rust engine is explicitly selected but the binary is missing (no silent fallback).
  • src/amplihack/recipes/__init__.py — Engine selection via RECIPE_RUNNER_ENGINE env var:
    • rust → Rust binary only (fails if not installed)
    • python → Python runner only
    • (not set) → auto-detect (Rust if binary found, Python otherwise)
  • src/amplihack/install.py — Step 6.5 automatically installs recipe-runner-rs during amplihack install if cargo is available
  • tests/recipes/test_rust_runner.py — 26 tests covering binary discovery, execution, JSON parsing, engine selection, and the ensure flow
  • docs/recipes/README.md — Documents engine selection table and auto-install
  • .gitignore — Excludes recipe runner checkout directory
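The selection rules listed above can be sketched as follows. This is a minimal illustration only: the function name `select_engine`, its parameters, and the error messages are assumptions made for this example and are not taken from `src/amplihack/recipes/__init__.py`. The `ValueError` branch mirrors the validation described by a later commit in this PR.

```python
from typing import Mapping, Optional


class RustRunnerNotFoundError(RuntimeError):
    """Raised when the Rust engine is requested but its binary is missing."""


def select_engine(env: Mapping[str, str], rust_binary: Optional[str]) -> str:
    """Return "rust" or "python" (hypothetical helper, not the real API).

    env         -- environment mapping, e.g. os.environ
    rust_binary -- path to the Rust runner if one was found, else None
    """
    choice = env.get("RECIPE_RUNNER_ENGINE", "").strip().lower()
    if choice == "python":
        return "python"
    if choice == "rust":
        if rust_binary is None:
            # Explicit selection never falls back to Python.
            raise RustRunnerNotFoundError(
                "RECIPE_RUNNER_ENGINE=rust but no Rust binary was found"
            )
        return "rust"
    if choice:
        raise ValueError(f"Unknown RECIPE_RUNNER_ENGINE value: {choice!r}")
    # Variable not set: auto-detect.
    return "rust" if rust_binary else "python"
```

A caller would presumably pass `os.environ` together with whatever `find_rust_binary()` returns, though the real call site may differ.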

Design Principles

  • No fallback: when RECIPE_RUNNER_ENGINE=rust, the binary must exist or execution fails with a clear error
  • Auto-detect is a convenience: if RECIPE_RUNNER_ENGINE is not set, the system picks the best available engine and logs which one was selected
  • Best-effort install: ensure_rust_recipe_runner() tries cargo install --git but doesn't block startup if it fails
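The best-effort install principle might look roughly like the sketch below. The helper name `ensure_best_effort`, its injectable `which` and `run` parameters, the timeout, and the placeholder repository URL are all assumptions for illustration; the real `ensure_rust_recipe_runner()` may be structured differently.

```python
import shutil
import subprocess
from typing import Callable, Optional, Sequence

# Placeholder URL, assumed for illustration only.
INSTALL_CMD = ("cargo", "install", "--git", "https://example.invalid/recipe-runner.git")


def ensure_best_effort(
    find_binary: Callable[[], Optional[str]],
    install_cmd: Sequence[str] = INSTALL_CMD,
    which: Callable[[str], Optional[str]] = shutil.which,
    run: Callable[..., object] = subprocess.run,
) -> bool:
    """Return True if the binary is available afterwards.

    Install failures are reported and swallowed so startup is never blocked.
    """
    if find_binary() is not None:
        return True  # already installed, nothing to do
    if which(install_cmd[0]) is None:
        return False  # no cargo on PATH: skip the install attempt
    try:
        run(list(install_cmd), check=True, timeout=600)
    except (subprocess.SubprocessError, OSError) as exc:
        # Warn and carry on instead of raising.
        print(f"warning: recipe runner install failed: {exc}")
        return False
    return find_binary() is not None
```

Making `which` and `run` parameters keeps the sketch testable without invoking cargo.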

Testing

uv run pytest tests/recipes/test_rust_runner.py -v  # 26 tests

@github-actions
Contributor

github-actions Bot commented Mar 8, 2026

🤖 Auto-fixed version bump

The version in pyproject.toml has been automatically bumped to the next patch version.

If you need a minor or major version bump instead, please update pyproject.toml manually and push the change.

@github-actions
Contributor

github-actions Bot commented Mar 8, 2026

Repo Guardian - Passed

All files are durable repository content

Reviewed 6 changed files:

  • .gitignore - Repository configuration
  • docs/recipes/README.md - Feature documentation
  • pyproject.toml - Version metadata
  • src/amplihack/recipes/__init__.py - Engine selection logic
  • src/amplihack/recipes/rust_runner.py - Rust integration module
  • tests/recipes/test_rust_runner.py - Unit tests

No ephemeral content detected (no meeting notes, temporary scripts, or point-in-time documents).

AI generated by Repo Guardian

@rysweet rysweet force-pushed the feature/rust-recipe-runner-integration branch from 1c447a3 to 902cfa8 Compare March 8, 2026 03:44
@github-actions
Contributor

github-actions Bot commented Mar 8, 2026

🤖 Auto-fixed version bump

The version in pyproject.toml has been automatically bumped to the next patch version.

If you need a minor or major version bump instead, please update pyproject.toml manually and push the change.

@rysweet rysweet force-pushed the feature/rust-recipe-runner-integration branch from c5dd179 to c0d3e13 Compare March 8, 2026 03:45
@github-actions
Contributor

github-actions Bot commented Mar 8, 2026

🤖 Auto-fixed version bump

The version in pyproject.toml has been automatically bumped to the next patch version.

If you need a minor or major version bump instead, please update pyproject.toml manually and push the change.

@github-actions
Contributor

github-actions Bot commented Mar 8, 2026

🤖 PR Triage Complete

Risk Level: Medium-High (6.5/10)
Priority: Medium
Status: ⚠️ Requires attention before merge


📊 Summary

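This PR integrates a Rust recipe runner as an alternative execution engine with automatic selection. In auto-detect mode it falls back to Python when the Rust binary is not found; with RECIPE_RUNNER_ENGINE=rust there is no fallback.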

Changes:

  • 626 additions, 4 deletions across 7 files
  • New rust_runner.py module with binary discovery and execution
  • Engine selection via RECIPE_RUNNER_ENGINE environment variable
  • Auto-install during amplihack install when cargo available
  • 26 comprehensive tests

⚠️ Critical Issues

  1. Merge Conflicts - mergeable_state: dirty - must be resolved before merging
  2. Cross-platform validation needed - Binary discovery may behave differently on Windows/macOS/Linux
  3. CI/CD impact - Rust toolchain availability in CI environments not validated

🎯 Strengths

  • ✅ Excellent test coverage (26 tests)
  • ✅ Clear error handling (no silent fallback when RECIPE_RUNNER_ENGINE=rust)
  • ✅ Backwards compatible (defaults to Python)
  • ✅ Well-documented engine selection
  • ✅ Graceful degradation pattern

🔍 Concerns

  • External dependency - Adds Rust toolchain requirement (optional but increases complexity)
  • Installation time - Auto-install may slow amplihack install process
  • Attack surface - New binary dependency requires security review
  • Platform coverage - Cross-platform binary discovery needs validation

📋 Recommended Actions

Before Merge:

  1. CRITICAL: Resolve merge conflicts
  2. ✅ Test on Windows, macOS, Linux to validate binary discovery
  3. ✅ Verify CI behavior when Rust unavailable
  4. ✅ Validate install.py handles cargo failures gracefully
  5. ⚙️ Consider performance benchmarks (Rust vs Python engine)

Post-Merge:

  • Monitor installation metrics for slowdowns
  • Collect engine selection telemetry to understand adoption
  • Document recommended environments for Rust engine

🏷️ Labels Applied

  • triage:complete
  • triage:needs-testing
  • triage:medium-risk

Triaged by PR Triage Agent on 2026-03-08T12:57:00Z

AI generated by PR Triage Agent

rysweet pushed a commit that referenced this pull request Mar 8, 2026
PR-M1: Split run_recipe_via_rust into focused helpers
PR-M2: Configurable timeouts via env vars
PR-M3: Remove point-in-time Python references in docs
PR-M4: Remove hardcoded counts from docs
PR-M5: Add tests for empty results and exception paths
PR-L1: Redact context values in log output
PR-L2: Lazy binary search path evaluation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
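
As a rough illustration of PR-M2 and PR-L1 from the commit message above, the sketch below reads a timeout from an environment mapping and redacts context values before logging. Both helper names and the idea of replacing every value with a fixed marker are assumptions; the commit does not state which variables or redaction format the real code uses.

```python
import math
from typing import Any, Dict, Mapping


def timeout_from_env(env: Mapping[str, str], name: str, default: float) -> float:
    """Read a positive, finite timeout in seconds; fall back to default otherwise."""
    raw = env.get(name)
    if raw is None:
        return default
    try:
        value = float(raw)
    except ValueError:
        return default
    if not math.isfinite(value) or value <= 0:
        return default
    return value


def redact_context(context: Mapping[str, Any]) -> Dict[str, str]:
    """Keep the keys for debugging but hide every value."""
    return {key: "<redacted>" for key in context}
```

Falling back to the default on malformed input is a design choice of this sketch; raising an error instead would be equally defensible.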
@github-actions
Contributor

github-actions Bot commented Mar 8, 2026

🟡 Triage Result: NEEDS CONFLICT RESOLUTION

Priority: MEDIUM-HIGH | Risk: HIGH

Assessment

High-value Rust recipe runner integration with manageable scope, but merge conflicts prevent automated merging.

Stats:

  • 9 files changed (+883/-13)
  • 6 commits, 15.5 hours old
  • Recent activity shows active development

Blockers

Merge conflicts - must be resolved before review

Recommended Action

  1. Resolve merge conflicts via rebase:

    git checkout feature/rust-recipe-runner-integration
    git fetch origin
    git rebase origin/main
    # Resolve conflicts
    git push --force-with-lease
  2. Manual review after conflicts resolved due to integration complexity

Why This Matters

Recipe runner integration is foundational infrastructure. Clean merge critical for:

  • Engine selection logic correctness
  • Startup dependency management
  • Python/Rust compatibility

Related Issues

  • See triage issue tracking merge conflict epidemic
  • Part of 75% PR conflict rate (3 of 4 PRs)

Automated triage by PR Triage Agent - Run #22827330377

AI generated by PR Triage Agent

Ubuntu and others added 6 commits March 8, 2026 20:46
Adds the Rust recipe runner binary integration with automatic engine
selection and startup dependency management.

- src/amplihack/recipes/rust_runner.py: Binary wrapper with find, ensure,
  and execute functions. RustRunnerNotFoundError for explicit failures.
  ensure_rust_recipe_runner() auto-installs via cargo if binary is missing.
- src/amplihack/recipes/__init__.py: Engine selection via RECIPE_RUNNER_ENGINE
  env var (rust/python/auto-detect). Exports ensure_rust_recipe_runner.
- src/amplihack/install.py: Step 6.5 ensures binary during amplihack install.
- tests/recipes/test_rust_runner.py: 26 tests covering discovery, execution,
  engine selection, and ensure flow.
- docs/recipes/README.md: Documents engine selection and auto-install.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Validate RECIPE_RUNNER_ENGINE values (raise ValueError on unknown)
- Add non-interactive footer to NestedSessionAdapter
- Add session depth tracking to NestedSessionAdapter

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PR-M1: Split run_recipe_via_rust into focused helpers
PR-M2: Configurable timeouts via env vars
PR-M3: Remove point-in-time Python references in docs
PR-M4: Remove hardcoded counts from docs
PR-M5: Add tests for empty results and exception paths
PR-L1: Redact context values in log output
PR-L2: Lazy binary search path evaluation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…th (C2-PR-1, C2-PR-2, C2-PR-6, C2-PR-9, C2-PR-10)

C2-PR-1: Raise ValueError on invalid RECIPE_RUNNER_ENGINE values
C2-PR-2: Log full traceback for ensure_rust_recipe_runner failures
C2-PR-6: Enforce AMPLIHACK_MAX_DEPTH in execute_agent_step
C2-PR-9: Add test for invalid engine value validation
C2-PR-10: Add test for execution timeout propagation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…g_dir (C2-INT-3/4/5/6/7/10)

C2-INT-3: Serialize Duration as f64 seconds (Rust repo)
C2-INT-4/5/6: Document Rust-only features in engine comparison table
C2-INT-7: Document all environment variables
C2-INT-10: Resolve working_dir to absolute path to prevent double-application

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
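
For C2-INT-10, one plausible reading is that a relative working_dir gets joined onto a base directory more than once along the call chain. The sketch below shows how resolving to an absolute path up front makes a second application harmless. The helper name and the assumed failure mode are illustrative guesses, not taken from the actual change.

```python
from pathlib import Path


def resolve_working_dir(working_dir: str, base: Path) -> Path:
    """Return working_dir as an absolute path, anchored at base if relative."""
    path = Path(working_dir)
    if not path.is_absolute():
        path = base / path
    # An absolute result is unchanged if it is "applied" to a base again.
    return path.resolve()
```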
…ority test (C3-PR-1/2/3)

C3-PR-1: Print warning on install exception (was silent)
C3-PR-2: Return resolved path from find_rust_binary for env var path
C3-PR-3: Fix false-confidence test with discriminating mock

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@rysweet rysweet force-pushed the feature/rust-recipe-runner-integration branch from 33e5b79 to 16928c8 Compare March 8, 2026 20:49
@github-actions
Contributor

github-actions Bot commented Mar 8, 2026

🤖 Auto-fixed version bump

The version in pyproject.toml has been automatically bumped to the next patch version.

If you need a minor or major version bump instead, please update pyproject.toml manually and push the change.

@rysweet rysweet merged commit 63409a0 into main Mar 8, 2026
1 check passed
@github-actions
Contributor

github-actions Bot commented Mar 8, 2026

Repo Guardian - Passed

All files are durable repository content

Reviewed 8 changed files:

  • .gitignore - Repository configuration
  • docs/recipes/README.md - Feature documentation
  • pyproject.toml - Version metadata
  • src/amplihack/install.py - Installation logic
  • src/amplihack/recipes/__init__.py - Engine selection logic
  • src/amplihack/recipes/adapters/nested_session.py - Adapter module
  • src/amplihack/recipes/rust_runner.py - Rust integration module
  • tests/recipes/test_rust_runner.py - Unit tests

No ephemeral content detected (no meeting notes, temporary scripts, or point-in-time documents).

AI generated by Repo Guardian


rysweet added a commit that referenced this pull request Mar 27, 2026
Replaces Python recipe runner with Rust implementation from rysweet/amplihack-recipe-runner.

- Engine selection via RECIPE_RUNNER_ENGINE env var (rust/python/auto-detect)
- Auto-installs via cargo on first use (ensure_rust_recipe_runner)
- Nested session depth enforcement (AMPLIHACK_MAX_DEPTH)
- Non-interactive footer for autonomous agent execution
- Configurable timeouts via env vars
- Context value redaction in logs
- 34 tests covering all paths

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
rysweet added a commit that referenced this pull request Mar 27, 2026
… on new repos without origin/main. ## Problem default-workflow currently assumes that origin/main already exists when it reaches s (#3620)

* fix: agent resolver now handles 3-part refs (namespace:category:name) (#2856)

Recipes use 3-part agent references like 'amplihack:core:architect' but
the resolver only handled 2-part 'amplihack:architect'. The split on the
first colon left 'core:architect' as the name, which failed the safety
regex. All recipe steps using amplihack:core:* or amplihack:specialized:*
silently lost their agent system prompts.

Now parses 2-part and 3-part refs correctly, validates each segment
independently, and rejects 4+ parts.
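
A minimal sketch of the parsing rule described above, assuming a simple per-segment pattern. The names `AgentRef` and `parse_agent_ref` and the exact regex are hypothetical; the real resolver's safety regex is not shown in this commit message.

```python
import re
from typing import NamedTuple, Optional

# Assumed per-segment safety pattern (letters, digits, underscore, hyphen).
_SEGMENT = re.compile(r"[A-Za-z0-9_-]+")


class AgentRef(NamedTuple):
    namespace: str
    category: Optional[str]  # None for 2-part refs
    name: str


def parse_agent_ref(ref: str) -> AgentRef:
    """Parse 'namespace:name' or 'namespace:category:name'."""
    parts = ref.split(":")
    if len(parts) not in (2, 3):
        raise ValueError(f"expected 2 or 3 segments, got {len(parts)}: {ref!r}")
    for part in parts:
        # Validate each segment on its own so 'core:architect' is never
        # checked as a single name.
        if not _SEGMENT.fullmatch(part):
            raise ValueError(f"invalid segment {part!r} in {ref!r}")
    if len(parts) == 2:
        return AgentRef(parts[0], None, parts[1])
    return AgentRef(parts[0], parts[1], parts[2])
```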

Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add continue-on-error to agentic workflow discussion steps (#2853)

* fix: add continue-on-error to agentic workflow discussion creation steps

Four agentic workflows fail when GitHub Discussions categories aren't
available: daily-code-metrics, weekly-issue-summary, issue-classifier,
and repo-guardian. Adding continue-on-error: true to the "Process Safe
Outputs" step lets them degrade gracefully instead of failing the run.

Closes #2749, closes #2790, closes #2745, closes #2756

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* [skip ci] chore: Auto-bump patch version

---------

Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* fix: add pre-flight check for claude CLI binary in AutoMode (#2854)

* fix: add pre-flight check for claude CLI binary in AutoMode

When the claude binary is missing, the Claude Agent SDK hangs or fails
silently. This adds:
1. Pre-flight shutil.which("claude") check before SDK initialization
2. Actionable error messages pointing to installation instructions
   when binary/process errors are detected at runtime
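
The pre-flight check described above can be sketched as below. `shutil.which` is the standard-library call named in the commit; the wrapper name `preflight_error`, the injectable `which` parameter, and the message wording are assumptions for this example.

```python
import shutil
from typing import Callable, Optional


def preflight_error(
    binary: str = "claude",
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return None if the CLI is on PATH, otherwise an actionable message."""
    if which(binary) is not None:
        return None
    return (
        f"'{binary}' CLI not found on PATH. "
        "Install it (see the project's prerequisites) and retry."
    )
```

Returning a message instead of raising lets the caller decide whether to abort before SDK initialization or only warn.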

Closes #2769

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* [skip ci] chore: Auto-bump patch version

---------

Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* fix(drift): only fail on CHANGED files, treat MISSING/EXTRA as warnings (#2857)

* fix(drift): only fail on CHANGED files, treat MISSING/EXTRA as warnings

Resolves the CI failures caused by check_drift.py exiting non-zero
for every PR due to 434 MISSING and 83 EXTRA files that represent
intentional structural differences between .claude/ and amplifier-bundle/.

Changes:
- scripts/check_drift.py: MISSING/EXTRA → warnings (exit 0), CHANGED → errors (exit 1)
- Sync 11 CHANGED files from .claude/skills/ to amplifier-bundle/skills/
- Sync 28 CHANGED files from .claude/skills/ to docs/claude/skills/

After this change check_drift.py exits 0 on the current repo state.
Future content drift (CHANGED) will still cause CI failure.
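
The exit-code policy above might be expressed like this. It is a sketch under assumptions: the function name, the `(status, path)` tuple shape, and the log format are invented for illustration and do not reflect the internals of `scripts/check_drift.py`.

```python
import sys
from typing import Iterable, Tuple


def drift_exit_code(findings: Iterable[Tuple[str, str]]) -> int:
    """Return 1 if any finding is CHANGED; MISSING/EXTRA only warn."""
    exit_code = 0
    for status, path in findings:
        if status == "CHANGED":
            print(f"ERROR   {status} {path}", file=sys.stderr)
            exit_code = 1
        elif status in ("MISSING", "EXTRA"):
            print(f"WARNING {status} {path}", file=sys.stderr)
        else:
            raise ValueError(f"unknown drift status: {status!r}")
    return exit_code
```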

Closes #2820 follow-up

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* [skip ci] chore: Auto-bump patch version

---------

Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* docs: streamline Claude CLI installation in PREREQUISITES.md (#2719)

- Consolidate 4 installation methods into a clear platform table
- Remove deprecated npm instructions (kept deprecation note)
- Remove redundant/conflicting information
- Simplify auto-installation section

Fixes #2371

Co-authored-by: voidborne-d <voidborne-d@users.noreply.github.com>

* docs: add reading guide to README and fix broken anchor (#2858)

* docs: add reading guide to README and fix broken #feature-catalog anchor

Adds a quick navigation guide after the install command. Also fixes
the pre-existing broken ToC link: #feature-catalog -> #features
(the actual heading is "## Features", not "## Feature Catalog").

Based on PR #2836 with anchor fix applied.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* [skip ci] chore: Auto-bump patch version

---------

Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* [docs] Update documentation for merged PRs from March 2-3, 2026 (#2823)

* Update documentation for merged PRs from March 2-3, 2026

Updates documentation following Diátaxis framework for 4 merged PRs:

## Recipe Runner Updates

### Recipe Discovery (PR #2813)
- Document installed package path support in discovery.py
- Update priority order in README and troubleshooting guide
- Add v0.9.0 feature callout for pip install support
- Explain absolute path resolution via Path(__file__)

### Bash Timeouts (PR #2807)
- Add timeout field to step fields table
- Document new "Bash Step Timeouts" section
- Clarify default behavior: no timeout (None)
- Show optional timeout configuration examples

### Adapter Auto-Detection (PR #2804)
- Document get_adapter() usage in new reference doc
- Explain NestedSessionAdapter selection for CLAUDECODE env
- Note heredoc quoting fixes and condition eval improvements

## Skills System Updates

### Skill Frontmatter (PR #2811)
- Add "YAML Frontmatter Requirements" section to SKILL_CATALOG.md
- Document common mistakes fixed in v0.9.0
- Show correct vs incorrect frontmatter examples
- List critical requirements for frontmatter validation

## New Reference Document

Created docs/recipes/RECENT_FIXES_MARCH_2026.md consolidating:
- All 4 PR fixes with technical details
- Root cause analysis for each issue
- Solutions and impact statements
- Test verification and documentation references

Follows Diátaxis framework:
- Reference: Technical specifications for discovery, timeouts, frontmatter
- How-to: Troubleshooting guides for common issues
- Explanation: Why changes were needed and their impact

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* docs: add missing CWD-relative paths to recipe discovery priority list

The recipe discovery docs listed 4 search paths but the code has 6.
Added the 2 CWD-relative legacy paths (amplifier-bundle/recipes/ and
src/amplihack/amplifier-bundle/recipes/) so the docs match the actual
discovery.py implementation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: GitHub Actions Bot <noreply@github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net>

* feat: persist NODE_OPTIONS memory config preference to ~/.amplihack/config (#2860)

* feat: persist NODE_OPTIONS memory config preference to ~/.amplihack/config

First run prompts the user for NODE_OPTIONS consent and saves the answer
to ~/.amplihack/config (JSON). Subsequent runs skip the prompt and emit
an informational message showing the saved setting and the config file
path so users know how to change it.

Changes:
- memory_config.py: add get_config_path, load_user_preference,
  save_user_preference; update get_memory_config to load saved pref and
  skip prompt for returning users; update display_memory_config to emit
  info message (with config path) for returning users
- test_memory_config.py: add TestConfigPersistence (8 tests) and
  TestFirstRunVsReturningUser (7 tests) covering first-run and
  returning-user code paths; all 67 tests pass
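
A minimal sketch of the persistence flow described above. The function names `load_user_preference` and `save_user_preference` come from the commit message, but their signatures, the `PREF_KEY` name, and the `_read_config` helper are assumptions made for this example.

```python
import json
from pathlib import Path
from typing import Optional

PREF_KEY = "node_options_consent"  # hypothetical key name


def _read_config(config_path: Path) -> dict:
    """Return the config as a dict, or {} if it is missing or unreadable."""
    try:
        data = json.loads(config_path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return {}
    return data if isinstance(data, dict) else {}


def load_user_preference(config_path: Path) -> Optional[bool]:
    """Return the saved answer, or None on first run."""
    value = _read_config(config_path).get(PREF_KEY)
    return value if isinstance(value, bool) else None


def save_user_preference(config_path: Path, consent: bool) -> None:
    """Persist the answer while keeping any other keys in the file."""
    data = _read_config(config_path)
    data[PREF_KEY] = consent
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(data, indent=2), encoding="utf-8")
```

A `None` result from `load_user_preference` is what would trigger the first-run prompt in this sketch.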

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* [skip ci] chore: Auto-bump patch version

---------

Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* fix: quality-audit must fix ALL findings per cycle, add structured inputs (#2842, #2843) (#2861)

Addresses two issues with the quality-audit recipe:

- #2842: Each cycle must fix ALL confirmed findings before moving to the next.
  Added fix-all-per-cycle enforcement rule, verify-fixes bash step that compares
  confirmed findings against fix results, and updated recurse-decision to check
  for NEW findings rather than old unfixed ones.

- #2843: Added structured inputs (severity_threshold, module_loc_limit,
  fix_all_per_cycle, categories) to the recipe context so audits are
  configurable and reproducible without modifying the recipe file.

Changes:
- quality-audit-cycle.yaml: v3.0.0 → v4.0.0, new context variables, verify-fixes
  step, strengthened fix step prompt, updated recurse-decision logic
- SKILL.md: v3.0 → v4.0, documented fix-all rule, structured inputs table,
  fix verification step, loop decision based on new findings
- 27 outside-in tests covering structured inputs, fix-all enforcement, verify
  logic, recipe version, and skill documentation

Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add RECIPE step type for sub-recipe composition (#2821) (#2862)

* feat: add RECIPE step type for sub-recipe composition (issue #2821)

Adds the ability to invoke sub-recipes as steps within a recipe YAML file,
enabling workflow composition and reuse.

Changes:
- models.py: Add StepType.RECIPE enum member; add `recipe` and `sub_context`
  fields to the Step dataclass
- parser.py: Parse `type: recipe` steps from YAML (reads `recipe` as sub-recipe
  name and `context` as dict merged into sub-recipe); add `recipe`/`context` to
  known step fields; infer StepType.RECIPE when `recipe` field present; validate
  that recipe steps have a `recipe` field
- runner.py: Import find_recipe at module level; add MAX_RECIPE_DEPTH=3 constant;
  add `_depth` parameter to RecipeRunner; add _execute_sub_recipe() that checks
  depth guard, merges context, and delegates to a child RecipeRunner; route
  StepType.RECIPE in _dispatch_step()
- tests: 19 new unit tests in test_recipe_step_type.py covering enum, dataclass
  fields, parser, happy-path execution, context merging, failure propagation,
  recursion depth guard, and dry-run
- docs: Update Step Fields table in docs/recipes/README.md with `recipe`,
  `context`, and `output` fields; add Recipe Step section with YAML example
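
To illustrate the depth guard and context merging described above, here is a deliberately simplified model. `MAX_RECIPE_DEPTH = 3` and the `_depth` parameter come from the commit message; everything else (recipes as plain dicts, the `run_recipe` function, `RecipeDepthError`, and the exact boundary at which the guard fires) is an assumption and not the real `RecipeRunner`.

```python
from typing import Any, Dict, List, Tuple

MAX_RECIPE_DEPTH = 3


class RecipeDepthError(RuntimeError):
    """Raised when sub-recipes nest deeper than MAX_RECIPE_DEPTH."""


def run_recipe(
    name: str,
    recipes: Dict[str, List[Dict[str, Any]]],
    context: Dict[str, Any],
    _depth: int = 0,
) -> List[Tuple[str, Dict[str, Any]]]:
    """Return (step id, context seen by that step) for every executed step."""
    executed: List[Tuple[str, Dict[str, Any]]] = []
    for step in recipes[name]:
        if "recipe" in step:
            if _depth >= MAX_RECIPE_DEPTH:
                raise RecipeDepthError(f"sub-recipe depth exceeds {MAX_RECIPE_DEPTH}")
            # The step's context overrides the parent's values.
            child_context = {**context, **step.get("context", {})}
            executed += run_recipe(step["recipe"], recipes, child_context, _depth + 1)
        else:
            executed.append((step["id"], dict(context)))
    return executed
```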

Closes #2821

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* [skip ci] chore: Auto-bump patch version

---------

Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* docs: fix README/CONTRIBUTING gaps — session explanation, install order, /dev intro, uv sync (#2865)

- #2777: Add explanation of what the interactive session does after install
- #2778: Clarify that uv sync installs all deps into local .venv
- #2780: Add "install prerequisites first" before install options
- #2781: Expand first /dev mention with context about what it does

#2782 already addressed (cost info exists). #2783, #2784 closed by owner.

Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: distributed hive mind — federation, LearningAgent eval, retrieval pipeline (#2717)

feat: distributed hive mind — federation, retrieval pipeline, CRDTs, gossip, eval framework

* fix: improve ADO skill auto-activation with keyword-rich descriptions (#2868)

- azure-devops: description now includes ADO, work items, user stories, bugs,
  sprints, builds, releases, Azure DevOps URLs
- azure-devops-cli: added auto_activate_keywords for az devops, az pipelines, etc.

Closes #2850

Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add transcript-viewer skill for JSONL log reading (#2869)

* feat: add transcript-viewer skill for JSONL log reading (#2445)

Adds a new Claude Code skill that wraps claude-code-log CLI to convert
and browse JSONL session transcripts as HTML or Markdown.

Features:
- Current session mode: views the most recently modified JSONL transcript
- Specific session mode: looks up a session by ID
- Agent output mode: views .agent-step-*.log background task files
- All sessions mode: lists and browses all project sessions with date-range filtering
- Graceful degradation: clear install instructions when claude-code-log is missing
- HTML or Markdown output format

Includes 18 tests validating all core behaviors (tool detection, mode routing,
date filtering, JSONL parsing, YAML frontmatter structure).

Closes #2445

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* [skip ci] chore: Auto-bump patch version

* feat: add GitHub Copilot CLI support to transcript-viewer skill (#2445)

Research shows Copilot CLI has no automatic log persistence (unlike Claude Code's
~/.claude/projects/*.jsonl). Sessions are exported manually via `/share markdown`.

Changes:
- Version bumped to 1.1.0
- Auto-detects log source: JSONL (Claude Code) vs markdown (Copilot /share export)
- Context detection uses launcher_detector.py env vars (CLAUDE_CODE_SESSION,
  GITHUB_COPILOT_TOKEN, COPILOT_SESSION) — defaults to claude-code as safe fallback
- Copilot guidance: when in Copilot context with no file, shows /share instructions
- Format detection guards against false-positives (no /share in regex)
- GITHUB_TOKEN excluded from Copilot markers (too generic for CI environments)
- 6 new tests (tests 12-18): format auto-detection, context detection, markdown parsing,
  false-positive guard, unknown format, plain-log detection
- 34/34 tests pass

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: sync quality-audit/SKILL.md drift between source and bundle/docs copies

The drift detection CI was failing because quality-audit/SKILL.md had diverged:
- Source (.claude/skills): v4.0 with fix-all-per-cycle rule (#2842)
- amplifier-bundle/skills: v3.0 (stale copy)
- docs/claude/skills: v3.0 (stale copy)

Copied source of truth to both locations to resolve the CHANGED drift errors.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(transcript-viewer): support GitHub Copilot CLI JSONL logs instead of markdown exports

Copilot CLI auto-saves sessions to ~/.copilot/session-state/*/events.jsonl
(JSONL format, not markdown exports). Update the skill to use the correct paths.

Changes:
- Fix Copilot log path to ~/.copilot/session-state/*/events.jsonl
- Add directory-based auto-detection (checks ~/.copilot/session-state/ and
  ~/.claude/projects/ for sessions before falling back to env vars)
- Mode 1: read latest events.jsonl from most recent ~/.copilot/session-state/*/ dir
- Mode 2: look up session by directory ID in ~/.copilot/session-state/
- Mode 4: list session dirs in ~/.copilot/session-state/ showing session IDs
- Remove incorrect "no auto-save" guidance and /share markdown instructions
- Document workspace.yaml, plan.md, and checkpoints/ in session structure
- Update tests: replace Copilot markdown tests with Copilot JSONL path tests
  (20 test groups, 44 assertions — all passing)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: add transcript-viewer documentation and sync to bundle/docs

- Created docs/claude/skills/transcript-viewer/README.md with usage guide
- Synced SKILL.md to amplifier-bundle/ and docs/ (drift prevention)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* fix(quality-audit-cycle): prevent JSON from being interpreted as bash commands in verify-fixes (#2886)

* fix(quality-audit-cycle): prevent JSON agent output from being interpreted as bash commands in verify-fixes step

The verify-fixes step used python3 -c "..." with double-quoting. When the
template variables {{validated_findings}} and {{fix_results}} were expanded
into $VALIDATED and $FIX_RESULTS via bash variable substitution inside the
double-quoted python3 -c string, any double-quotes in the JSON (all JSON
keys/strings) terminated the outer bash double-quoted argument prematurely.
This caused bash to interpret JSON content (e.g. "json:", "cycle:", "validated:")
as shell commands, producing errors like:
  /bin/bash: line 17: json: command not found
  /bin/bash: line 19: cycle:: command not found

Fix: export the bash variables and use a single-quoted heredoc (<<'PYEOF')
so Python reads the JSON from os.environ instead of string interpolation.
This completely isolates the JSON content from bash string parsing.
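
The same isolation idea, shown in Python rather than bash: the JSON travels through the environment and the child program text stays fixed, so no quoting of the payload is ever needed. The function name `count_findings`, the `VALIDATED` variable name as used here, and the child script are assumptions for illustration and not the recipe's actual step.

```python
import os
import subprocess
import sys

# Fixed child program: it never has JSON spliced into its source text.
CHILD_CODE = "import json, os; print(len(json.loads(os.environ['VALIDATED'])))"


def count_findings(validated_json: str) -> int:
    """Pass JSON to a child interpreter via the environment, not via -c text."""
    env = dict(os.environ, VALIDATED=validated_json)
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())
```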

Closes #2872

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* [skip ci] chore: Auto-bump patch version

---------

Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* fix: remove redundant single quotes from quality-audit-cycle bash templates (#2887)

render_shell() already applies shlex.quote() to template variables.
The manual single quotes in the YAML ('{{var}}') caused double-quoting
that broke bash — JSON output was interpreted as commands.

Removed single quotes from all bash step template variable assignments
in the verify-fixes, update-history, and decide-continue steps.
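
The double-quoting effect can be reproduced with the standard library alone. In the sketch below, `render_shell` is a toy stand-in written for this example (the real function's implementation is not shown in this commit message); only `shlex.quote` and `shlex.split` are real library calls.

```python
import shlex
from typing import Mapping


def render_shell(template: str, variables: Mapping[str, object]) -> str:
    """Toy renderer: substitute {{name}} with a shell-quoted value."""
    rendered = template
    for key, value in variables.items():
        rendered = rendered.replace("{{" + key + "}}", shlex.quote(str(value)))
    return rendered


payload = '{"cycle": 1}'

# Template without manual quotes: the value stays one shell word.
good = render_shell("VALUE={{v}}", {"v": payload})

# Template with manual quotes: the extra quotes cancel the ones added by
# shlex.quote, so the JSON is exposed to word splitting.
bad = render_shell("VALUE='{{v}}'", {"v": payload})

good_words = shlex.split(good)
bad_words = shlex.split(bad)
```

Here `good_words` contains a single assignment word, while `bad_words` splits into two words, the second of which a shell would try to run as a command.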

Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: split power_steering_checker.py into modular package with Copilot support (#2845) (#2910)

* refactor: split power_steering_checker.py 5063 LOC into 5 modules (issue #2845)

- Extract considerations.py: ConsiderationAnalysis, PowerSteeringResult, CheckerResult, PowerSteeringRedirect dataclasses and analysis logic
- Extract sdk_calls.py: Claude SDK interaction layer with configurable _timeout constant
- Extract progress_tracking.py: progress thresholds as configurable MODULE_PROGRESS_THRESHOLD and OVERALL_PROGRESS_THRESHOLD constants
- Extract result_formatting.py: output formatting and display utilities
- Extract main_checker.py: PowerSteeringChecker orchestrator, check_session, is_disabled entry points
- Add __init__.py with backward-compatible re-exports (all public symbols preserved)
- Fix broad except Exception blocks to log at WARNING with exc_info=True
- Make hardcoded timeouts (SDK_TIMEOUT=30) and thresholds configurable module-level constants
- Add comprehensive unit tests for each module (test_psc_*.py)
- Add architecture documentation

Closes: part of #2845

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address security and code quality issues from PR #2872 review

Security (blocking):
- SEC-1: Add _validate_session_id() guard in _log_violation() before
  path construction (main_checker.py) — prevents path traversal via
  crafted session_id like '../../etc/x'
- SEC-2: Add _validate_session_id() guard in _write_summary() before
  path construction (progress_tracking.py) — same class of bug

Code quality:
- Move `import sys` from inside _write_with_retry() to module level
  (progress_tracking.py)
- Remove redundant `import sys` inside except block in sdk_calls.py;
  use `_sys` already imported at line 474
- Add exc_info=True to _save_redirect() ERROR log (progress_tracking.py)
- Guard sys.path.insert() with `if _hook_dir not in sys.path` to prevent
  duplicate entries on repeated imports (main_checker.py)

Requirements deviation:
- Add analyze_consideration to __all__ in __init__.py; remove # noqa: F401

All 154 PSC tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: add missing re-exports and sync .claude/ copy for power_steering_checker

Outside-in testing found 2 issues with the power_steering_checker refactor:

1. __init__.py was missing 5 symbols: get_shared_runtime_dir,
   _write_with_retry, MAX_TRANSCRIPT_LINES, CHECKER_TIMEOUT,
   PARALLEL_TIMEOUT. Test mock.patch() calls targeting these would fail.

2. .claude/tools/ still had the old 5200-line monolith while
   amplifier-bundle/ had the new package. Replaced monolith with
   package copy to prevent divergence.

Note: considerations.py (2423 lines) still needs further splitting
but that's a separate task.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: split considerations.py into 6 focused modules (issue #2845)

Split ConsiderationsMixin (2423 lines) into focused sub-modules:
- session_detection.py: SessionDetectionMixin (8 methods + constants)
- transcript_helpers.py: TranscriptHelpersMixin (6 methods)
- checks_workflow.py: ChecksWorkflowMixin (6 methods + patterns)
- checks_quality.py: ChecksQualityMixin (9 methods + constants)
- checks_docs.py: ChecksDocsMixin (7 methods + constants)
- checks_ci_pr.py: ChecksCiPrMixin (7 methods + pattern)

considerations.py now retains only dataclasses (CheckerResult,
ConsiderationAnalysis, PowerSteeringRedirect, PowerSteeringResult),
the _env_int helper, and ConsiderationsMixin as a shell that inherits
from all 6 focused mixins with PHASE1_CONSIDERATIONS.

Updated __init__.py to re-export all new mixin classes. Synced
.claude/ copy with all refactored files.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: update test_power_steering_worktree.py for refactored package structure

- Update mock.patch targets from power_steering_checker.get_shared_runtime_dir
  to power_steering_checker.main_checker.get_shared_runtime_dir (function now
  lives in main_checker submodule, not top-level namespace)
- Fix .disabled file creation paths: current _is_disabled() checks
  shared_runtime/power-steering/.disabled, not shared_runtime/.disabled
  (tests were written against an older implementation)

All 26 worktree integration tests now pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: align mixin method behavior with original monolith for 6 failing tests

Fix 6 failing tests in power_steering_checker refactor (issue #2845):

1. _check_agent_unnecessary_questions (checks_quality.py):
   Revert to counting question marks in assistant text instead of
   AskUserQuestion tool calls, matching original behavior.

2. _check_documentation_updates (checks_docs.py):
   Revert to checking all code file modifications (not just public-facing
   paths), so any code change without docs update triggers the check.

3. _check_next_steps (checks_workflow.py):
   Revert to simple keyword matching (next steps, todo, pending, etc.)
   instead of requiring structured bulleted lists with regex patterns.

4. _check_review_responses (checks_ci_pr.py):
   Revert to checking user messages for review-related keywords instead
   of requiring concrete PR review CLI commands.

5. _check_unrelated_changes (checks_quality.py):
   Revert to simple file count heuristic (>20 files = scope creep)
   instead of top-level directory counting (which failed for absolute
   paths outside project root).

6. test_redirect_saved_on_block_decision (main_checker.py):
   Add deterministic override step (5a) after _analyze_considerations:
   run heuristics for satisfied considerations and override with False
   if heuristic detects concrete failure. This prevents SDK fail-open
   from masking real failures like incomplete TODOs. Preserves SDK-first
   architecture in _check_single_consideration_async.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(power_steering_checker): address 3 confirmed findings from quality audit

Security/reliability/dead-code audit of all 12 modules in the refactored
power_steering_checker package (PR #2872). Three confirmed findings fixed:

1. Dead code (MEDIUM) — checks_quality.py:408
   Redundant `import re as _re` inside _check_interactive_testing method
   body; `re` is already imported at module level (line 1). Removed the
   late import and replaced _re.search / _re.IGNORECASE with re.search /
   re.IGNORECASE.

2. Dead code (MEDIUM) — checks_docs.py:181-185
   Unreachable branch in _check_feature_docs_discoverable: edge case 2
   had condition `and not new_features` which is always False at that
   point because edge case 1 already returned True when not new_features.
   The entire dead block was removed (behaviour unchanged — the early
   return in edge case 1 already covers the empty-features scenario;
   with feature definitions present we must not skip discoverability).

3. Silent fallback (MEDIUM) — main_checker.py:153-156
   `except OSError: pass` in PowerSteeringChecker.__init__ swallowed
   runtime-dir creation failures with no log output. Changed to capture
   the exception and emit a stdlib logger.warning for observability while
   preserving the fail-open behaviour.

All 83 tests pass (1 skipped), 0 failures.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(power_steering_checker): add Copilot transcript format auto-detection and parsing

Adds transcript_parser.py with:
- detect_transcript_format(): inspects first JSONL line to identify Claude Code
  vs GitHub Copilot CLI events.jsonl format (flat role-based or event-based)
- parse_copilot_transcript(): normalizes Copilot events into the same list[dict]
  shape checker methods expect (type, message.role, message.content, timestamp,
  sessionId)
- parse_claude_code_transcript(): existing passthrough behavior, no normalization
- parse_transcript(): auto-detect + dispatch entry point

Updates _load_transcript in main_checker.py to use parse_transcript(), so both
Claude Code JSONL and Copilot events.jsonl are processed transparently.

Adds 48 tests in test_transcript_parser.py covering format detection, event
normalization, oversized-line safety, and _load_transcript integration.

Existing Claude Code transcript parsing is unchanged (same raw dicts returned).

Closes #2845

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(power_steering): add Copilot e2e tests and sync transcript_parser to .claude/ (#2845)

Extends the Copilot transcript format support added in the preceding commit
by syncing the transcript_parser module and _load_transcript integration
into the .claude/ package copy, and adding a comprehensive e2e test suite
with a realistic Copilot-format fixture.

Changes:
- Sync .claude/ power_steering_checker package with amplifier-bundle:
  - Add transcript_parser.py (copied from amplifier-bundle — Copilot format
    detection and normalization)
  - Update main_checker.py _load_transcript() to use parse_transcript():
    auto-detects Claude Code vs Copilot CLI format, normalizes Copilot events
    into canonical list[dict] shape, logs format detection for observability

- Add tests/fixtures/copilot_events.jsonl (both .claude/ and amplifier-bundle):
  Realistic 20-line Copilot CLI session: conversation_start, user message,
  assistant messages with tool_call/tool_result events for file writes,
  test execution, git commit, final assistant summary, conversation_end

- Add tests/test_copilot_e2e_power_steering.py (22 tests):
  Full end-to-end pipeline coverage not present in test_transcript_parser.py:
  - Format detection: claude_code, copilot flat, copilot event, empty, fixture
  - Normalization: user/assistant/tool_call/tool_result, flat format, unchanged
    Claude Code passthrough
  - check_session() on raw Copilot JSONL (auto-detect path)
  - check_session() on pre-normalized Copilot transcript
  - detect_session_type() on normalized Copilot messages
  - result.decision in ["approve", "block"], result.reasons is list[str]
  - Empty session handled gracefully
  - Edge cases: unknown role skipped, JSON-string arguments decoded,
    interleaved turns, malformed JSON line skipped (fail-open)

All 22 new tests pass; 48 amplifier-bundle transcript_parser tests pass;
30 existing power_steering_checker tests pass (1 skipped, pre-existing).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* [skip ci] chore: Auto-bump patch version

* fix(transcript_parser): support real Copilot dotted event type format

The Copilot transcript parser was broken against real Copilot session data.
Real Copilot sessions use dotted type names (user.message, assistant.message,
session.start) with content nested under a 'data' sub-object, not the fake
event-based format the fixtures previously used.

Changes:
- detect_transcript_format(): detect dotted type names (user.message,
  assistant.message, session.start etc.) as 'copilot' format
- normalize_copilot_event(): handle user.message (data.content) and
  assistant.message (data.content, data.toolRequests); skip all lifecycle
  events (session.start, session.model_change, assistant.turn_start,
  assistant.turn_end, session.shutdown)
- Replace copilot_events.jsonl fixture with real data from
  ~/.copilot/session-state/b1cc7005-4c26-46d7-a9e5-ebc5a882be65/events.jsonl
- Add 17 new tests for real Copilot format in test_transcript_parser.py
- Sync all fixes to .claude/ copy (parser, fixture, test file)
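
As an illustration of the normalization described above, here is a small sketch. The function names and the top-level `timestamp` key on the raw event are assumptions; only the dotted type names, the `data.content` nesting, and the skip-on-malformed-line behaviour come from the commit messages.

```python
import json
from typing import Iterable, List, Optional

# Only these dotted types carry conversation content; lifecycle events
# (session.start, assistant.turn_start, ...) are dropped.
_ROLE_BY_TYPE = {"user.message": "user", "assistant.message": "assistant"}


def normalize_event(event: dict) -> Optional[dict]:
    """Map one dotted-type event to the flat message shape, or None to skip."""
    role = _ROLE_BY_TYPE.get(event.get("type", ""))
    if role is None:
        return None
    data = event.get("data") or {}
    return {
        "type": role,
        "message": {"role": role, "content": data.get("content", "")},
        "timestamp": event.get("timestamp"),  # assumed field name
    }


def parse_lines(lines: Iterable[str]) -> List[dict]:
    """Parse JSONL lines, skipping malformed lines instead of failing."""
    messages = []
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # fail-open on bad input
        if isinstance(event, dict):
            normalized = normalize_event(event)
            if normalized is not None:
                messages.append(normalized)
    return messages
```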

Verified: parse_transcript() correctly returns format='copilot' and 2
normalized messages (user + assistant) from the real events.jsonl.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: unset CLAUDECODE env var before claude_agent_sdk.query() calls to prevent nested session errors

When running inside an existing Claude Code session, the CLAUDECODE env
var causes claude_agent_sdk.query() subprocess spawning to fail with:
"Claude Code cannot be launched inside another Claude Code session."

Fix: Add os.environ.pop("CLAUDECODE", None) at module level in all files
that use claude_agent_sdk.query(), matching the pattern already used by
the multitask orchestrator.

Files fixed:
- amplifier-bundle/tools/amplihack/hooks/claude_power_steering.py
- amplifier-bundle/tools/amplihack/hooks/claude_reflection.py
- amplifier-bundle/skills/pm-architect/scripts/triage_pr.py
- amplifier-bundle/skills/pm-architect/scripts/generate_roadmap_review.py
- amplifier-bundle/skills/pm-architect/scripts/generate_daily_status.py
- src/amplihack/launcher/auto_mode.py

Also syncs .claude/ and docs/claude/ copies to match amplifier-bundle.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* Update documentation for power_steering_checker refactoring and Copilot CLI support (#2911)

## Changes

### Power-Steering Documentation (docs/features/power-steering/README.md)
- Updated Architecture section to reflect modular 12-file package structure
- Added changelog entry for v0.10.0 (2026-03-07) with refactoring details
- Documented Copilot CLI transcript support
- Highlighted 76% LOC reduction (5,063 → 1,217 lines in largest module)
- Added module responsibility descriptions and cross-references

### API Reference (docs/reference/power-steering-checker-api.md)
- Added refactoring summary to Package Overview section
- Documented 191 tests passing (121 existing + 48 parser + 22 Copilot e2e)
- Noted CLAUDECODE environment variable fix for nested sessions
- Cross-referenced power_steering_checker package README

### Copilot CLI Integration (docs/COPILOT_CLI.md)
- Bumped version to 1.1.0 (from 1.0.0)
- Added new "Copilot CLI Transcript Support" section to ToC
- Documented auto-detection of Claude Code vs Copilot CLI transcript formats
- Explained module structure and testing coverage
- Highlighted benefits for Copilot CLI users (session completion validation)

## Context

Following Diátaxis framework:
- **Explanation**: Architecture changes in Power-Steering docs
- **Reference**: API updates in reference documentation
- **Tutorial**: Usage guidance in Copilot CLI integration docs

Based on merged PRs from 2026-03-06 to 2026-03-07:
- PR #2910: Major refactoring into modular package
- PR #2887: Bash template quoting fix
- PR #2886: JSON handling in quality-audit-cycle

Co-authored-by: Claude Documentation Agent <claude-agent@amplihack.github.io>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: recipe runner nesting — unset CLAUDECODE, tmux execution, auto-install tmux (#2912)

* fix: recipe runner nesting — unset CLAUDECODE, tmux execution, auto-install tmux

Three fixes for recipe runner execution inside Claude Code sessions:

1. ClaudeSDKAdapter: Added os.environ.pop("CLAUDECODE", None) before
   the SDK query call. The parent Claude Code session sets CLAUDECODE,
   and child sessions refuse to start if it is present. Both adapters
   now strip it.

2. dev-orchestrator SKILL.md: Updated execution instructions to use
   tmux sessions instead of run_in_background. Claude Code's background
   task manager kills processes after ~10 min (Issue #2909). Recipe
   workstreams can take hours. Instructions now use:
   - tmux new-session -d for detached execution
   - env -u CLAUDECODE for clean env
   - CLISubprocessAdapter for subprocess isolation

3. Install script: Auto-installs tmux if missing (apt-get, brew, dnf).
   tmux is now required for recipe runner execution.

Closes #2909

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* [skip ci] chore: Auto-bump patch version

---------

Co-authored-by: Ubuntu <azureuser@devy.yb0a3bvkdghunmsjr4s3fnfhra.phxx.internal.cloudapp.net>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* feat: power-steering SDK abstraction — auto-select Claude or Copilot SDK (#2917) (#2918)

* feat: power-steering SDK abstraction — auto-select Claude or Copilot SDK (#2917)

Add power_steering_sdk.py that auto-detects the active launcher (Claude
Code or GitHub Copilot CLI) via LauncherDetector and routes LLM queries
to the correct SDK. One code path, two backends.

- New: power_steering_sdk.py with query_llm(prompt, project_root) -> str
- Refactored: claude_power_steering.py — replaced 5 identical 12-line
  SDK call blocks with single-line query_llm() calls (-228 lines)
- Auto-detection: uses existing LauncherDetector from adaptive context
- Fail-open: returns "" if neither SDK available (heuristic fallback)
- Both SDKs support sessions for future optimization

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* [skip ci] chore: Auto-bump patch version

* fix: correct Copilot SDK API usage — async methods, proper session lifecycle (#2917)

CopilotClient methods are async coroutines, not synchronous calls. Fixed
_query_copilot to await start/create_session/send_and_wait/stop and to
extract the response text from event.data.content. Verified against the
real Copilot SDK: the server reported a version mismatch, but the API
calls themselves resolved correctly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* fix: enforce default-workflow for single-workstream tasks (#2928)

* fix: enforce default-workflow for single-workstream tasks in smart-orchestrator

The single-workstream execution paths in smart-orchestrator.yaml were using
`type: agent` with a builder agent and a text prompt that merely *asked* the
agent to follow DEFAULT_WORKFLOW steps 0-22. This provided no enforcement --
the builder agent would skip workflow steps and implement directly.

Changed `execute-single-round-1` and `execute-single-fallback-blocked` from
`type: agent` to `type: recipe` with `recipe: default-workflow`, matching
the multi-workstream path which already uses recipe-based execution via
orchestrator.py. Parent context (task_description, repo_path) flows through
automatically via the runner's context merging in _execute_sub_recipe().

Continuation rounds (execute-round-2, execute-round-3) remain as agent steps
since they are incremental work referencing previous round results.

Fixes #2927

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* [skip ci] chore: Auto-bump patch version

---------

Co-authored-by: Ubuntu <azureuser@devy.yb0a3bvkdghunmsjr4s3fnfhra.phxx.internal.cloudapp.net>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* feat: add skill_invocation power-steering check (from closed #2916) (#2926)

* feat: add skill_invocation power-steering check (extracted from #2916)

Detects when a user requests a skill via slash command (<command-name> tag)
but the agent bypasses it and responds directly without invoking the Skill
tool. This was independently valuable work from closed PR #2916 that got
dropped during the stale PR cleanup.

- checks_workflow.py: Add _check_skill_invocation method
- considerations.yaml: Add skill_invocation blocker for DEV/INVESTIGATION/MAINTENANCE
- 12 outside-in tests (Claude + Copilot + edge cases)

Related to #2914

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* [skip ci] chore: Auto-bump patch version

---------

Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* fix: add OPERATIONS session type for PM/planning sessions (#2914) (#2923)

* fix: add OPERATIONS session type for PM/planning sessions (#2914)

Power-steering incorrectly activated dev checks on Q&A/PM sessions
(e.g. /pm-architect) because Read/Grep tool usage triggered
INVESTIGATION classification, which applies workflow checks.

Add OPERATIONS session type detected via PM/planning keywords
(prioritize, backlog, roadmap, sprint, triage, etc.) that skips
all power-steering considerations — same as SIMPLE sessions.

Detection priority: SIMPLE > OPERATIONS > DEVELOPMENT > INVESTIGATION.

Closes #2914, Closes #2913

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* [skip ci] chore: Auto-bump patch version

* test: add OPERATIONS session type detection tests (#2914)

12 unit tests covering:
- PM/planning keyword detection (pm-architect, backlog, roadmap, sprint)
- OPERATIONS skips all considerations
- Env override AMPLIHACK_SESSION_TYPE=OPERATIONS
- Priority: SIMPLE > OPERATIONS > DEVELOPMENT > INVESTIGATION
- Code modifications with operations keywords

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add outside-in e2e tests for OPERATIONS session type (#2914)

8 tests covering both Claude and Copilot sessions:
- Claude /pm-architect → OPERATIONS classification
- Claude full check() flow → approve (no blocks)
- Copilot backlog triage → OPERATIONS classification
- Copilot full check() flow → approve (no blocks)
- Development sessions still get full checks
- Simulates real transcript structure with tool_use blocks

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* fix: add safety valve to stop hook lock mode (#2874) (#2924)

* fix: add safety valve to stop hook lock mode to prevent infinite loops (#2874)

When lock mode is active and the agent has completed all work, the stop
hook blocks every stop attempt indefinitely — creating an infinite loop
of 100+ empty block cycles consuming API tokens with zero productive work.

Add a safety valve: after AMPLIHACK_MAX_LOCK_ITERATIONS (default 50)
consecutive lock blocks, auto-approve the stop and remove the lock file.
The user is notified via stderr and can re-enable with /amplihack:lock.

The threshold is configurable via environment variable for users who need
longer autonomous sessions.
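
The safety valve logic can be sketched as follows. The counter file and the `handle_stop` function are hypothetical; only the environment variable name, the default of 50, and the "approve and remove the lock file" behaviour are taken from the description above.

```python
import os
from pathlib import Path


def handle_stop(lock_file: Path, counter_file: Path) -> str:
    """Return "block" or "approve" for a single stop attempt."""
    if not lock_file.exists():
        return "approve"  # lock mode is off, nothing to enforce

    limit = int(os.environ.get("AMPLIHACK_MAX_LOCK_ITERATIONS", "50"))
    count = int(counter_file.read_text()) if counter_file.exists() else 0
    count += 1

    if count >= limit:
        # Safety valve: stop blocking and clear lock state.
        lock_file.unlink()
        counter_file.unlink(missing_ok=True)
        return "approve"

    counter_file.write_text(str(count))
    return "block"
```

With the threshold set to 3, three consecutive attempts yield block, block, approve under this sketch.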

Closes #2874

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* [skip ci] chore: Auto-bump patch version

* test: add safety valve unit tests for stop hook lock mode (#2874)

6 tests covering:
- Normal lock block below threshold
- Safety valve triggers at threshold (50)
- Lock file removal on trigger
- Custom threshold via AMPLIHACK_MAX_LOCK_ITERATIONS
- Below-threshold with custom value
- No lock file approves normally

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add outside-in e2e tests for stop hook safety valve (#2874)

7 tests covering both Claude and Copilot sessions:
- Claude: lock blocks normally, safety valve at default threshold,
  custom threshold, simulated infinite loop scenario
- Copilot: lock blocks, safety valve triggers
- No lock mode: unaffected (still approves normally)

Simulated infinite loop test reproduces the exact #2874 scenario:
3 rapid stop attempts with threshold=3, verifying block→block→approve.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* fix: enforce review and outside-in testing steps in multitask workstreams (#2925) (#2930)

* fix: enforce review and outside-in testing steps in multitask workstreams (#2925)

Root causes:
1. smart-orchestrator.yaml: validate-outside-in-testing had a condition
   requiring 'pull/' in round_1_result. For parallel/multitask workstreams,
   round_1_result is the orchestrator report (no PR URLs), so this step was
   permanently SKIPPED for all multi-workstream executions.

2. default-workflow.yaml: step-17a-compliance-verification only echoed
   instructions without checking local_testing_gate — it was a no-op that
   never blocked review when step-13 (outside-in testing) was skipped.

Fixes:
- smart-orchestrator.yaml: Remove 'pull/' URL check from condition.
  Validation now triggers for any Development task with results, letting
  the reviewer agent determine whether outside-in testing was done.
  Updated prompt handles both single-workstream (PR URLs visible) and
  parallel workstream (orchestrator report format) execution modes.

- default-workflow.yaml: step-17a now reads {{local_testing_gate}} and
  exits 1 if empty, hard-blocking the review phase until step-13 is done.

Tests:
- tests/outside_in/test_multitask_mandatory_review_steps.py (14 tests)
  * Verifies condition triggers for parallel orchestrator reports
  * Verifies condition still skips Q&A/Operations tasks
  * Verifies step-17a exits non-zero when testing gate is empty
  * Verifies step-17a succeeds when testing gate is populated
  * Verifies both recipe copies are in sync

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* [skip ci] chore: Auto-bump patch version

---------

Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* fix: use stdin for large payloads in recipe summarize step (#2921) (#2931)

* fix: use stdin for large payloads in recipe summarize step (#2921)

Prevent "Argument list too long" error when processing large workstreams
by passing data via stdin instead of command-line arguments in the
CLI subprocess adapter.
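
A short sketch of the stdin approach, using a stand-in child process rather than the real CLI. `run_with_stdin` is a hypothetical helper name.

```python
import subprocess
import sys
from typing import List


def run_with_stdin(cmd: List[str], payload: str) -> str:
    """Run cmd, feeding payload on stdin instead of as an argument."""
    result = subprocess.run(
        cmd, input=payload, capture_output=True, text=True, check=True
    )
    return result.stdout


# Stand-in child: reports how many characters it read from stdin.
child = [sys.executable, "-c", "import sys; print(len(sys.stdin.read()))"]
big_payload = "x" * 5_000_000  # far larger than typical argv limits
print(run_with_stdin(child, big_payload).strip())
```

Because the payload never appears in argv, its size is not bounded by the operating system's argument-length limit.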

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* [skip ci] chore: Auto-bump patch version

---------

Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* fix: remove CLAUDECODE env var detection, centralize stripping (#2883)

* fix: remove CLAUDECODE env var detection, centralize env stripping

The Claude Code binary sets CLAUDECODE to block nested sessions, but we
always want nested sessions to work. This change:

- Removes CLAUDECODE-based adapter selection logic from get_adapter()
- Creates centralized build_child_env() in adapters/env.py that strips
  CLAUDECODE from all child processes in one place
- Removes `unset CLAUDECODE` from shell scripts (Python adapters handle it)
- Updates tests to verify env stripping without depending on detection
- Updates documentation to remove CLAUDECODE as a user-facing concern
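
A centralized helper of this kind might look like the sketch below. The optional `base` parameter is an assumption added for testability; the real build_child_env() signature is not shown in this log.

```python
import os
from typing import Dict, Mapping, Optional


def build_child_env(base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment without CLAUDECODE."""
    env = dict(os.environ if base is None else base)
    env.pop("CLAUDECODE", None)  # no error if the variable is absent
    return env
```

A caller would then pass the result to the child, for example `subprocess.run(cmd, env=build_child_env())`, leaving the parent's own environment untouched.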

Follow-up to #2845 quality improvements.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* [skip ci] chore: Auto-bump patch version

---------

Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* investigate: dual status classifiers — DO-NOT-RECOMMEND unification (#2898) (#2933)

* investigate: add recommendation doc and tests for dual status classifiers (#2898)

Investigation finding: DO-NOT-RECOMMEND unifying the two classifiers.
WorkflowClassifier (pre-action routing, string input) and SessionDetectionMixin
(post-hoc enforcement, transcript input) serve fundamentally different purposes
with different input types, output taxonomies, and consumers.

Adds outside-in regression tests that guard against unbounded keyword drift.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: trigger full CI pipeline

* [skip ci] chore: Auto-bump patch version

---------

Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* feat: parallel VM polling with ThreadPoolExecutor (#2896) (#2934)

* feat: implement parallel VM polling with ThreadPoolExecutor (#2896)

Add poll_vm_statuses() to Orchestrator and refresh_pool_statuses() to
VMPoolManager. VM status polling previously had no parallel mechanism;
this adds concurrent polling via ThreadPoolExecutor so that status checks
for many VMs run at the same time instead of one after another.
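
The pattern can be sketched like this. `poll_all` and its parameters are hypothetical and do not reproduce poll_vm_statuses(); the per-VM poll function is supplied by the caller.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, List


def poll_all(
    vm_names: List[str],
    poll_one: Callable[[str], str],
    max_workers: int = 8,
) -> Dict[str, str]:
    """Poll every VM concurrently and collect one status per VM."""
    statuses: Dict[str, str] = {}
    if not vm_names:
        return statuses
    workers = min(max_workers, len(vm_names))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(poll_one, name): name for name in vm_names}
        for future in as_completed(futures):
            name = futures[future]
            try:
                statuses[name] = future.result()
            except Exception as exc:
                # One failing VM should not hide the others' results.
                statuses[name] = f"error: {exc}"
    return statuses
```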

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: trigger full CI pipeline

* [skip ci] chore: Auto-bump patch version

---------

Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* fix: make .claude/tools/amplihack/hooks/ canonical, amplifier-bundle a symlink (#2881) (#2935)

* fix: make .claude/tools/amplihack/hooks/ canonical, amplifier-bundle a symlink (#2881)

Replace amplifier-bundle/tools/amplihack/hooks/ directory with a symlink
to .claude/tools/amplihack/hooks/, making .claude/ the single source of truth.

Changes:
- Convert amplifier-bundle/tools/amplihack/hooks/ from a directory to a
  symlink pointing to ../../../.claude/tools/amplihack/hooks/
- Move 10 files that existed only in amplifier-bundle/hooks/ to .claude/hooks/:
  dev_intent_router.py, templates/routing_prompt.txt, and 8 test files
- Update test_main_branch_protection.py to verify symlink structure instead
  of checking byte-identity of two separate copies
- Update docs/features/main-branch-protection.md to reflect symlink architecture
- Add outside-in tests (tests/outside_in/test_hooks_canonical_location.py)
  verifying the symlink structure from a user's perspective

The build system (build_hooks.py) already uses symlinks=True in shutil.copytree,
so the symlink is preserved correctly in wheel builds.
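
The effect of symlinks=True can be seen in a throwaway directory. The layout below (canonical/ and bundle/hooks) is a simplified, hypothetical stand-in for the real repository paths, and it assumes a platform where creating symlinks is permitted.

```python
import os
import shutil
import tempfile
from pathlib import Path


def copy_tree_keeping_links(src: Path, dst: Path) -> None:
    """Copy a tree, copying symlinks as links rather than their targets."""
    shutil.copytree(src, dst, symlinks=True)


# Build a tiny tree: canonical/ holds the file, bundle/hooks links to it.
root = Path(tempfile.mkdtemp()) / "repo"
(root / "canonical").mkdir(parents=True)
(root / "canonical" / "hook.py").write_text("print('hook')\n")
(root / "bundle").mkdir()
os.symlink("../canonical", root / "bundle" / "hooks", target_is_directory=True)

out = root.parent / "build"
copy_tree_keeping_links(root, out)
print((out / "bundle" / "hooks").is_symlink())
```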

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* [skip ci] chore: Auto-bump patch version

* docs: note canonical location in hooks README

Add note that .claude/tools/amplihack/hooks/ is the canonical source
and amplifier-bundle/tools/amplihack/hooks/ is a symlink to it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* feat: add allow-list for safe send_input patterns (#2903) (#2936)

* feat: add allow-list for safe send_input patterns (#2903)

Implements a configurable allow-list mechanism for the send_input action
used in gadugi-agentic-test YAML scenarios. Safe patterns (y, n, Enter,
quit, exit) are accepted without confirmation; arbitrary values require
explicit opt-in via confirm=True (--confirm flag).

New module: src/amplihack/testing/send_input_allowlist.py
  - DEFAULT_SAFE_PATTERNS frozenset of common interaction responses
  - ALLOWLIST_ENV_VAR for loading extra patterns from a JSON file
  - UnsafeInputError raised on disallowed values
  - is_safe_pattern() / validate_send_input() / validate_scenario_send_inputs()
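
A rough sketch of how such an allow-list check could work. The names mirror those listed above, but the signatures, the exact-match comparison, and the `extra` parameter are assumptions rather than the module's actual API.

```python
from typing import FrozenSet

DEFAULT_SAFE_PATTERNS: FrozenSet[str] = frozenset({"y", "n", "Enter", "quit", "exit"})


class UnsafeInputError(ValueError):
    """Raised when a send_input value is neither allow-listed nor confirmed."""


def is_safe_pattern(value: str, extra: FrozenSet[str] = frozenset()) -> bool:
    # Exact matching is an assumption; the real module may normalize input.
    return value in DEFAULT_SAFE_PATTERNS or value in extra


def validate_send_input(
    value: str, confirm: bool = False, extra: FrozenSet[str] = frozenset()
) -> str:
    """Return value if it is safe or explicitly confirmed, else raise."""
    if confirm or is_safe_pattern(value, extra):
        return value
    raise UnsafeInputError(f"send_input value {value!r} requires confirm=True")
```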

47 outside-in tests added in tests/outside_in/test_safe_send_input_allowlist.py

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: trigger full CI pipeline

* [skip ci] chore: Auto-bump patch version

---------

Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* feat: add --port flag to azlin for bastion tunnel reuse (#2897) (#2937)

* feat: add --port flag to azlin for bastion tunnel reuse (#2897)

Reduces connection overhead by allowing SSH sessions to reuse an
existing bastion tunnel instead of creating a new one per command.

Changes:
- VMOptions.tunnel_port: new optional int field
- Executor.__init__ accepts tunnel_port; _azlin_port_args() helper
  injects --port <N> into all azlin subcommands (cp, connect, ssh)
- SessionManager._execute_ssh_command accepts tunnel_port parameter
- CLI: --port option added to both 'exec' and 'start' commands
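
A helper along these lines is small enough to sketch. The function names and the position of the flag at the end of the command are assumptions; the actual _azlin_port_args() may place it differently.

```python
from typing import List, Optional


def port_args(tunnel_port: Optional[int]) -> List[str]:
    """Return the extra CLI arguments for an existing tunnel, if any."""
    if tunnel_port is None:
        return []  # no tunnel to reuse, leave the command unchanged
    return ["--port", str(tunnel_port)]


def with_port(base_cmd: List[str], tunnel_port: Optional[int]) -> List[str]:
    """Append the port arguments to a command list without mutating it."""
    return [*base_cmd, *port_args(tunnel_port)]
```

Returning an empty list when no port is set lets every call site splice the arguments in unconditionally.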

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: trigger full CI pipeline

* [skip ci] chore: Auto-bump patch version

---------

Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* feat: split fleet CLI - extract _cli_formatters from _cli_session_ops (#2900) (#2938)

* feat: split fleet CLI - extract formatters module from session ops (#2900)

- Create src/amplihack/fleet/ package with:
  - _cli_formatters.py: ScoutResult, AdvanceResult dataclasses and
    format_scout_report/format_advance_report functions (224 LOC)
  - _cli_session_ops.py: Fleet session lifecycle management, run_scout,
    run_advance - imports formatters, stays under 400 LOC (366 LOC)
  - __init__.py: Clean public API re-exports
- Add outside-in tests (38 tests, 100% pass) covering:
  - Session lifecycle (start, stop, list, status)
  - Scout and advance agent operations
  - All three output formats (table, json, yaml)
  - Module separation verification

Addresses issue #2900: extract format_scout_report and format_advance_report
into separate _cli_formatters.py, keeping session ops focused.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: trigger full CI pipeline

* [skip ci] chore: Auto-bump patch version

---------

Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* fix: use run-script file for tmux eval pipeline to fix quoting issues (#2922) (#2939)

* fix: use run-script file for tmux eval pipeline to fix quoting issues (#2922)

The execute_remote_tmux method previously embedded the decoded prompt
directly in a tmux send-keys command:

    tmux send-keys -t SESSION "amplihack ... -p \"$PROMPT\"" C-m

When $PROMPT contained double quotes, dollar signs, backticks, or other
shell-special characters, the shell inside the tmux pane misinterpreted
the command, breaking end-to-end automation.

Fix: write a self-contained run script using a heredoc with a
single-quoted delimiter ('AMPLIHACK_RUN_EOF'), which prevents the outer
shell from expanding $(...) and $VARIABLE. Python f-string substitution
inserts literal base64 values before the shell processes the heredoc.
The script decodes both the prompt and API key from base64 at execution
time inside the tmux session, using properly-quoted "$PROMPT" to pass
the value as a single argument regardless of content.
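
The quoting approach described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in executor.py: `build_run_script_command`, the `launcher` parameter, and `script_path` are hypothetical names, and only the prompt (not the API key) is handled.

```python
import base64
import shlex


def build_run_script_command(prompt: str, launcher: str, script_path: str) -> str:
    """Return shell text that writes a run script through a quoted heredoc."""
    # Base64 uses only [A-Za-z0-9+/=], so the encoded prompt is safe to embed literally.
    prompt_b64 = base64.b64encode(prompt.encode("utf-8")).decode("ascii")
    lines = [
        # The single-quoted delimiter stops the outer shell from expanding $(...) or $VAR.
        f"cat > {shlex.quote(script_path)} << 'AMPLIHACK_RUN_EOF'",
        "#!/bin/bash",
        # Decoding happens only when the script itself runs inside the tmux session.
        f"PROMPT=$(printf '%s' '{prompt_b64}' | base64 -d)",
        # Quoting "$PROMPT" passes the decoded text as exactly one argument.
        # `launcher` is assumed to be a trusted command prefix chosen by the caller.
        f'{launcher} -p "$PROMPT"',
        "AMPLIHACK_RUN_EOF",
    ]
    return "\n".join(lines) + "\n"
```

Because the raw prompt never appears in the generated shell text, quotes, dollar signs, and backticks in it cannot change how the shell parses the command.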

Changes:
- Replace 4 fragile send-keys lines with a clean run-script approach in
  execute_remote_tmux() across all three executor.py copies
- Add 32 outside-in tests in tests/outside_in/test_eval_pipeline_tmux.py
  covering special character safety, heredoc quoting, API key encoding,
  tmux session creation, bash syntax validity, and regression prevention

All 19 existing unit tests and 32 new outside-in tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* ci: trigger full CI pipeline

* [skip ci] chore: Auto-bump patch version

---------

Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* feat: cache SSH output between discovery and reasoning phases (#2899) (#2942)

* feat: cache SSH output between discovery and reasoning phases (#2899)

Add TTL-based SSH output cache to SessionManager so repeated
capture_output() calls within the TTL window reuse cached results
instead of re-running SSH commands. Reduces SSH overhead during
the discovery→reasoning transition.
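
A minimal sketch of a TTL-based output cache, to illustrate the idea. The class name `TTLOutputCache`, its methods, and the 30-second default are assumptions for illustration; the actual SessionManager cache may be structured differently.

```python
import time
from typing import Callable, Dict, Tuple


class TTLOutputCache:
    """Cache command output per key and reuse it until the TTL expires (hypothetical)."""

    def __init__(
        self, ttl_seconds: float = 30.0, clock: Callable[[], float] = time.monotonic
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get_or_fetch(self, key: str, fetch: Callable[[], str]) -> str:
        """Return the cached output for `key`, calling `fetch` only on a miss or expiry."""
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]  # still fresh: skip the SSH round trip
        output = fetch()
        self._entries[key] = (now, output)
        return output
```

Using a monotonic clock avoids spurious expiry or staleness when the wall clock is adjusted.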

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* ci: trigger full CI pipeline

---------

Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: Oxidizer recipe — automated Python-to-Rust migration workflow (#2950)

* feat: Oxidizer recipe — automated Python-to-Rust migration workflow

Adds the oxidizer-workflow recipe, skill definition, and documentation.

- amplifier-bundle/recipes/oxidizer-workflow.yaml: 65-step recipe with
  iterative convergence loops, quality audits, and zero-tolerance parity
- .claude/skills/oxidizer-workflow/SKILL.md: Skill definition with activation
  keywords and usage examples
- docs/OXIDIZER.md: Full documentation covering all phases, context variables,
  and the zero-tolerance policy
- mkdocs.yml: Navigation entries for the workflow and skill

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* [skip ci] chore: Auto-bump patch version

---------

Co-authored-by: Ubuntu <azureuser@devo.xh24nwhiyviedbtbx54dafh01e.dx.internal.cloudapp.net>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* feat: integrate /top5 priority aggregation into pm-architect (#2941)

* feat: integrate /top5 priority aggregation into pm-architect

Merge top5 priority aggregation directly into pm-architect rather than
as a standalone skill. Adds Pattern 5 (Quick Priority View) and Pattern 6
(Daily Standup) to the orchestrator.

Files added/modified:
- scripts/generate_top5.py: Aggregates priorities across backlog-curator,
  workstream-coordinator, roadmap-strategist, and work-delegator into a
  strict Top 5 ranked list (weights: 35/25/25/15)
- scripts/tests/test_generate_top5.py: 31 unit tests covering extraction,
  aggregation, ranking, tiebreaking, and edge cases
- scripts/tests/conftest.py: Added pm_dir, sample_backlog_items, and
  populated_pm fixtures for top5 testing
- SKILL.md: Added /top5 trigger, Pattern 5 and 6, updated scripts list

Closes milestone 1 of #2932

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: GitHub-native data sourcing for /top5

Rewrites generate_top5.py to query live GitHub data instead of reading
static .pm/ YAML files. Sources issues and PRs across multiple GitHub
accounts via gh CLI search API.

- Reads .pm/sources.yaml for account/repo configuration
- Fetches open issues and PRs via gh api search/issues
- Scores by: label priority, staleness, comment activity, draft status
- Falls back to .pm/backlog/ for local overrides when GitHub unavailable
- Restores original gh account after multi-account queries
- Weights: issues 40%, PRs 30%, roadmap alignment 20%, local 10%
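
One way the weighting above could be applied is sketched here. It assumes each candidate carries a `source` tag and a normalized `raw_score`; those field names, the `rank_top` function, and the multiply-then-sort rule are illustrative assumptions rather than the logic in generate_top5.py.

```python
from typing import Dict, List

# Weights taken from the list above; the source keys are hypothetical labels.
SOURCE_WEIGHTS: Dict[str, float] = {
    "issue": 0.40,
    "pr": 0.30,
    "roadmap": 0.20,
    "local": 0.10,
}


def rank_top(candidates: List[dict], limit: int = 5) -> List[dict]:
    """Weight each candidate by its source, sort descending, and keep the top `limit`."""
    scored = [
        {**c, "score": c["raw_score"] * SOURCE_WEIGHTS[c["source"]]}
        for c in candidates
    ]
    # sorted() is stable, so ties keep their input order.
    ranked = sorted(scored, key=lambda c: c["score"], reverse=True)
    return [{**c, "rank": i} for i, c in enumerate(ranked[:limit], start=1)]
```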

Tested live: 116 candidates from the rysweet and rysweet_microsoft accounts.
29 unit tests passing (mocked gh calls + aggregation logic).

Implements Milestone 0 of #2932

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address review findings in /top5 implementation

- Remove dead `env` variable and unused comment in run_gh()
- Simplify get_current_gh_account() from fragile stderr parsing to `gh api user --jq`
- Fix lstrip("-* ") bug in extract_roadmap_goals() using removeprefix()
- Rename shadow variable `l` to `lbl` in list comprehensions for readability
- Fix SKILL.md weight documentation to match actual code (40/30/20/10)
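
The `lstrip("-* ")` pitfall mentioned above comes from `str.lstrip` treating its argument as a set of characters rather than a prefix. A small sketch, where `strip_bullet` is a hypothetical stand-in for the logic in extract_roadmap_goals() and `str.removeprefix` requires Python 3.9+:

```python
def strip_bullet(line: str) -> str:
    """Remove one leading markdown bullet marker, leaving the rest of the text intact."""
    text = line.strip()
    for marker in ("- ", "* "):
        if text.startswith(marker):
            # removeprefix drops exactly this prefix, at most once.
            return text.removeprefix(marker)
    return text


goal = "- **Ship v2**"
# lstrip removes every leading '-', '*', or ' ' character, eating the bold markers too.
over_stripped = goal.lstrip("-* ")
correct = strip_bullet(goal)
```

Here `over_stripped` is `"Ship v2**"` while `correct` is `"**Ship v2**"`.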

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* [skip ci] chore: Auto-bump patch version

* test: add outside-in agentic test scenarios for /top5 trigger

Five gadugi-agentic-test YAML scenarios covering:
- Smoke test: valid JSON output with expected keys
- GitHub source aggregation: end-to-end with real sources.yaml
- Error handling: malformed YAML, missing dirs, empty sources
- Local overrides: .pm/backlog + roadmap alignment scoring
- Ranking enforcement: top-5 limit, rank fields 1-5, score ordering

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: enriched /top5 output with score breakdown, actions, near-misses, and repo summary

The previous output was a flat ranked list with no context for decision-making.
Now includes:

- Score breakdown per item (label_priority, staleness, activity components)
- Suggested action per item ("Merge, close, or rebase", "Fix immediately", etc.)
- Near-misses (items #6-#10 that just missed the cut)
- Per-repo summary (issue/PR counts, high-priority counts)
- Per-account summary (total work across repos)
- Full metadata preserved (labels, dates, days_stale, draft status, comments)

40 tests passing (up from 29), covering new features:
- suggest_action logic for all source types
- near_misses return from aggregate_and_rank
- build_repo_summary grouping and counting

Refs #2932

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: sync skill mirrors and remove ephemeral .pm/ state files

- Sync all new pm-architect files to amplifier-bundle/skills/ and docs/claude/skills/
  (fixes "Check skill/agent drift" CI failure)
- Remove .pm/ ephemeral state files (backlog, workstreams, delegations, config)
  that Repo Guardian correctly flagged as point-in-time state
- Add .pm/ to .gitignore to prevent future accidental commits

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* [skip ci] chore: Auto-bump patch version

* feat: Oxidizer recipe — automated Python-to-Rust migration workflow (#2950)

* feat: Oxidizer recipe — automated Python-to-Rust migration workflow

Adds the oxidizer-workflow recipe, skill definition, and documentation.

- amplifier-bundle/recipes/oxidizer-workflow.yaml: 65-step recipe with
  iterative convergence loops, quality audits, and zero-tolerance parity
- .claude/skills/oxidizer-workflow/SKILL.md: Skill definition with activation
  keywords and usage examples
- docs/OXIDIZER.md: Full documentation covering all phases, context variables,
  and the zero-tolerance policy
- mkdocs.yml: Navigation entries for the workflow and skill

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* [skip ci] chore: Auto-bump patch version

---------

Co-authored-by: Ubuntu <azureuser@devo.xh24nwhiyviedbtbx54dafh01e.dx.internal.cloudapp.net>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Ubuntu <azureuser@devy.yb0a3bvkdghunmsjr4s3fnfhra.phxx.internal.cloudapp.net>
Co-authored-by: Ubuntu <azureuser@devo.xh24nwhiyviedbtbx54dafh01e.dx.internal.cloudapp.net>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat: Rust recipe runner integration with engine selection (#2951)

Replaces Python recipe runner with Rust implementation from rysweet/amplihack-recipe-runner.

- Engine selection via RECIPE_RUNNER_ENGINE env var (rust/python/auto-detect)
- Auto-installs via cargo on first use (ensure_rust_recipe_runner)
- Nested session depth enforcement (AMPLIHACK_MAX_DEPTH)
- Non-interactive footer for autonomous agent execution
- Configurable timeouts via env vars
- Context value redaction in logs
- 34 tests covering all paths
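
The engine-selection rule can be sketched as below, following the behavior stated in the PR description (explicit `rust` fails if the binary is missing, `python` forces the Python runner, unset auto-detects). The `select_engine` function and its handling of unrecognized values are assumptions for illustration.

```python
from typing import Mapping, Optional


class RustRunnerNotFoundError(RuntimeError):
    """Raised when the Rust engine is explicitly requested but no binary is found."""


def select_engine(env: Mapping[str, str], rust_binary: Optional[str]) -> str:
    """Pick 'rust' or 'python' from RECIPE_RUNNER_ENGINE and binary availability."""
    choice = env.get("RECIPE_RUNNER_ENGINE", "").strip().lower()
    if choice == "python":
        return "python"
    if choice == "rust":
        if rust_binary is None:
            # No silent fallback when Rust was explicitly requested.
            raise RustRunnerNotFoundError("RECIPE_RUNNER_ENGINE=rust but no binary found")
        return "rust"
    if choice:
        # Assumption: unknown values are rejected rather than ignored.
        raise ValueError(f"unknown RECIPE_RUNNER_ENGINE value: {choice!r}")
    # Unset: auto-detect, preferring Rust when the binary is available.
    return "rust" if rust_binary is not None else "python"
```

A caller might pass `os.environ` and the result of a binary lookup such as `shutil.which(...)`, which keeps the function itself free of global state and easy to test.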

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: classic mode launcher hangs indefinitely due to multi-line -p arg (#2958)

* fix: put classic launcher -p argument on single line to prevent hang (#2946)

The classic mode launcher wrote a multi-line -p argument that the shell
split at newlines, causing amplihack claude to wait on stdin indefinitely.
Collapse the prompt onto a single line so the entire -p value is passed
as one argument.
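
A minimal sketch of collapsing a prompt onto one line. The `single_line_prompt` name is hypothetical, and collapsing all runs of whitespace (not just newlines) is an assumption about acceptable behavior.

```python
def single_line_prompt(prompt: str) -> str:
    """Collapse newlines and other whitespace runs into single spaces."""
    # str.split() with no arguments splits on any whitespace, including \n and \r\n.
    return " ".join(prompt.split())


multi_line = "Fix the bug\nin the launcher\n\nand add tests\n"
flattened = single_line_prompt(multi_line)
```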

Add 3 regression tests verifying no newlines in the -p argument.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* [skip ci] chore: Auto-bump patch version

---------

Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* fix: ensure Rust recipe runner on startup, add cargo to prerequisites (#2957)

* fix: ensure Rust recipe runner on startup, add cargo to prerequisites

- Add ensure_rust_recipe_runner() call to copilot launcher startup
- Add Rust/cargo to Prerequisites in README.md and PREREQUISITES.md
- Add cargo install instructions to all platform sections (macOS, Ubuntu, Fedora, Arch)
- Add cargo --version to verification commands

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat: add Rust recipe runner check to all startup paths

Move ensure_rust_recipe_runner() from copilot-only to a shared
_ensure_rust_recipe_runner() function called from all 6 launcher
entry points: launch, claude, RustyClawd, copilot, codex, amplifier.

Previously only copilot and install had the check, leaving 4 paths
uncovered.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor: consolidate launcher startup into _common_launcher_startup()

Extract nesting detection, framework staging, Rust recipe runner check,
SDK dep check, and power-steering prompt into a single idempotent
function called from all 6 launcher entry points.
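
The idempotency of such a shared startup function can be sketched with a module-level guard. The names `common_startup` and `_startup_done` are hypothetical, and the real `_common_launcher_startup()` may track state differently.

```python
from typing import Callable, List

_startup_done = False  # module-level guard: set once the startup steps have run


def common_startup(steps: List[Callable[[], None]]) -> bool:
    """Run the startup steps in order, at most once per process.

    Returns True if the steps ran, False if an earlier call already ran them.
    """
    global _startup_done
    if _startup_done:
        return False  # a second entry point reached us: nothing to do
    _startup_done = True
    for step in steps:
        step()
    return True
```

Setting the guard before running the steps means a failing step is not retried by a later call, which is one possible design choice among several.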

Before: launch_command() had 7 init steps; copilot/codex/amplifier
only had staging + rust runner. Now all paths get identical init.

Also fixes 6 pre-existing test failures in test_cli_claude_command_guard
by mocking _common_launcher_startup (staging sys.exit(1) was leaking
through the test harness).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test: add 19 tests for _common_launcher_startup()

Covers:
- Idempotency guard (double-call safe for RustyClawd → launch_command)
- subprocess_safe skip
- Nesting detection and auto-staging
- Startup steps order (staged → rust → sdk → power-steering)
- Non-fatal failure handling for SDK deps and power-steering
- _ensure_rust_recipe_runner output (success, warning, import error)
- All 6 launcher paths call _common_launcher_startup

Outside-in verified: each launcher command (launch, claude, RustyClawd,
copilot, codex, amplifier) shows 'Rust recipe runner available' in real
subprocess output.

Co-authored-by: Copilot <223556219+Copilo…